Efficient Exploration in Reinforcement Learning
Author: Sebastian Thrun
Abstract
Exploration plays a fundamental role in any active learning system. This study evaluates the role of exploration in active learning and describes several local techniques for exploration in finite, discrete domains, embedded in a reinforcement learning framework (delayed reinforcement). The paper distinguishes between two families of exploration schemes: undirected and directed exploration. While the former family is closely related to random walk exploration, directed exploration techniques memorize exploration-specific knowledge that is used to guide the exploration search. In many finite deterministic domains, any learning technique based on undirected exploration is inefficient in terms of learning time, i.e. learning time is expected to scale exponentially with the size of the state space (Whitehead, 1991b). We prove that for all these domains, reinforcement learning using a directed technique can always be performed in polynomial time, demonstrating the important role of exploration in reinforcement learning. (The proof is given for one specific directed exploration technique, named counter-based exploration.) Subsequently, several exploration techniques found in recent reinforcement learning and connectionist adaptive control literature are described. In order to trade off efficiently between exploration and exploitation, a trade-off which characterizes many real-world active learning tasks, combination methods are described which explore and avoid costs simultaneously. These include a selective attention mechanism, which allows smooth switching between exploration and exploitation. All techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation). The empirical evaluation is followed by an extensive discussion of the benefits and limitations of this work.
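For concreteness, here is a minimal sketch, not the paper's own code, contrasting the two families on a small deterministic grid world. The environment, grid size, and tie-breaking rule are illustrative assumptions; the directed, counter-based strategy keeps a visit counter per state and steps toward the least-visited successor, which is the kind of exploration-specific knowledge the abstract refers to, while the undirected strategy is a pure random walk.

```python
import random

# Hypothetical deterministic grid world (an assumption for illustration):
# states are (x, y) cells, actions move one cell, walls clip at the border.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
SIZE = 10

def successor(state, action):
    """Deterministic transition function of the toy environment."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))

def undirected_step(state):
    """Undirected exploration: a pure random walk over actions."""
    return random.choice(ACTIONS)

def counter_based_step(state, counts):
    """Directed (counter-based) exploration: prefer the action whose
    successor has been visited least often; ties broken randomly.
    For simplicity this assumes the agent can predict successors."""
    least = min(counts.get(successor(state, a), 0) for a in ACTIONS)
    best = [a for a in ACTIONS if counts.get(successor(state, a), 0) == least]
    return random.choice(best)

def explore(steps=1000, directed=True):
    """Count how many distinct states each strategy reaches."""
    state, counts = (0, 0), {(0, 0): 1}
    for _ in range(steps):
        a = counter_based_step(state, counts) if directed else undirected_step(state)
        state = successor(state, a)
        counts[state] = counts.get(state, 0) + 1
    return len(counts)

if __name__ == "__main__":
    random.seed(0)
    print("random walk visited:", explore(directed=False), "states")
    print("counter-based visited:", explore(directed=True), "states")
```

Running the sketch typically shows the counter-based agent covering the grid far more evenly for the same number of steps, which is the intuition behind the exponential-versus-polynomial learning-time gap proved in the paper.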
Similar Papers
Parallel Recombinative Reinforcement Learning
A technique is presented which is suitable for function optimization in high-dimensional binary domains. The method allows an efficient parallel implementation and is based on the combination of genetic algorithms and reinforcement learning schemes. More specifically, a population of probability vectors is considered, each member corresponding to a reinforcement learning optimizer. Each probabilit...
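A minimal sketch of the probability-vector idea follows, under stated assumptions: the update rule, learning rate, and crossover scheme below are guesses in the spirit of PBIL-like methods, not this paper's exact algorithm. Each population member is a vector of bit probabilities; samples drawn from it are scored, the vector is nudged toward its best sample (the reinforcement step), and recombination then mixes vectors across the population (the genetic step).

```python
import random

def sample(p):
    """Draw a binary string from a probability vector."""
    return [1 if random.random() < pi else 0 for pi in p]

def onemax(bits):
    """Illustrative objective: count of ones (stands in for any f)."""
    return sum(bits)

def reinforce(p, bits, lr=0.1):
    """Nudge each bit probability toward the rewarded sample."""
    return [pi + lr * (b - pi) for pi, b in zip(p, bits)]

def crossover(p, q):
    """Uniform recombination of two probability vectors."""
    return [random.choice(pair) for pair in zip(p, q)]

def prl(n_bits=20, pop_size=8, generations=50, samples=10):
    """Population of probability vectors, each an RL-style optimizer."""
    pop = [[0.5] * n_bits for _ in range(pop_size)]
    for _ in range(generations):
        # Each vector reinforces toward the best of its own samples ...
        for i, p in enumerate(pop):
            best = max((sample(p) for _ in range(samples)), key=onemax)
            pop[i] = reinforce(p, best)
        # ... then genetic recombination mixes vectors across the population.
        pop = [crossover(random.choice(pop), random.choice(pop))
               for _ in range(pop_size)]
    return max((sample(p) for p in pop), key=onemax)

if __name__ == "__main__":
    random.seed(0)
    print(onemax(prl()), "ones out of 20")
```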
Efficient Reinforcement Learning through Symbiotic Evolution
This article presents a novel reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution), which evolves a population of neurons through genetic algorithms to form a neural network capable of performing a task. Symbiotic evolution promotes both cooperation and specialization, which results in a fast, efficient genetic search and prevents convergence to suboptimal solutions. I...
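A heavily simplified sketch of the symbiotic idea follows, with all details (population size, network wiring, the stand-in fitness task) as assumptions rather than SANE's actual encoding: individual neurons, not whole networks, form the population; networks are assembled from random subsets of neurons, and each neuron's fitness is averaged over the networks it participated in, which rewards neurons that specialize and cooperate.

```python
import random

N_NEURONS, NET_SIZE, TRIALS = 40, 5, 200

def make_neuron(n_inputs=4):
    """A neuron here is just a weight vector (SANE also evolves
    connectivity; omitted for brevity)."""
    return [random.uniform(-1, 1) for _ in range(n_inputs)]

def network_output(neurons, x):
    """Assemble a network by summing the activations of its neurons."""
    return sum(sum(w * xi for w, xi in zip(n, x)) for n in neurons)

def task_fitness(neurons):
    """Illustrative stand-in task: approximate the sum of the inputs."""
    error = 0.0
    for _ in range(10):
        x = [random.uniform(-1, 1) for _ in range(4)]
        error += abs(network_output(neurons, x) - sum(x))
    return -error

def evolve(generations=30):
    pop = [make_neuron() for _ in range(N_NEURONS)]
    for _ in range(generations):
        score, count = [0.0] * N_NEURONS, [0] * N_NEURONS
        # Evaluate neurons symbiotically: sample random subnetworks and
        # credit each participating neuron with the network's fitness.
        for _ in range(TRIALS):
            idx = random.sample(range(N_NEURONS), NET_SIZE)
            f = task_fitness([pop[i] for i in idx])
            for i in idx:
                score[i] += f
                count[i] += 1
        avg = [score[i] / max(count[i], 1) for i in range(N_NEURONS)]
        # Keep the better half of neurons; refill by mutating survivors.
        order = sorted(range(N_NEURONS), key=lambda i: avg[i], reverse=True)
        keep = [pop[i] for i in order[:N_NEURONS // 2]]
        pop = keep + [[w + random.gauss(0, 0.1) for w in random.choice(keep)]
                      for _ in range(N_NEURONS - len(keep))]
    return pop

if __name__ == "__main__":
    random.seed(0)
    best = evolve()
    print("final fitness:", round(task_fitness(best[:NET_SIZE]), 3))
```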
A comparison of direct and model-based reinforcement learning
This paper compares direct reinforcement learning (no explicit model) and model-based reinforcement learning on a simple task: pendulum swing up. We find that in this task model-based approaches support reinforcement learning from smaller amounts of training data and efficient handling of changing goals.
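To make the contrast concrete, here is a minimal sketch of the two update styles on a toy discrete chain standing in for the pendulum domain; the step sizes, the chain itself, and the Dyna-style replay used as the model-based variant are illustrative assumptions and need not match the compared paper's method. Direct RL updates a value estimate from each sampled transition alone, while the model-based variant also remembers transitions and replays them, which is why it can extract more from small amounts of data.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95          # illustrative step size and discount

Q = defaultdict(float)            # Q[(state, action)]
model = {}                        # model[(state, action)] = (reward, next_state)

def direct_update(s, a, r, s2, actions):
    """Direct RL (Q-learning): update from the sampled transition only."""
    target = r + GAMMA * max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

def model_based_update(s, a, r, s2, actions, planning_steps=20):
    """Model-based RL (Dyna-style): remember the transition, then replay
    remembered transitions to squeeze more learning out of the same data."""
    model[(s, a)] = (r, s2)
    direct_update(s, a, r, s2, actions)
    for _ in range(planning_steps):
        (ms, ma), (mr, ms2) = random.choice(list(model.items()))
        direct_update(ms, ma, mr, ms2, actions)

if __name__ == "__main__":
    # Tiny illustrative chain: states 0..2, action 1 moves right,
    # reward 1 on reaching state 2 (a stand-in for swing-up success).
    random.seed(0)
    ACTS = [0, 1]
    for _ in range(50):
        s = 0
        while s != 2:
            a = random.choice(ACTS)
            s2 = min(s + a, 2)
            r = 1.0 if s2 == 2 else 0.0
            model_based_update(s, a, r, s2, ACTS)
            s = s2
    print("Q(1, move right):", round(Q[(1, 1)], 3))
```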